Genome-wide identification and in silico analysis of NPF, NRT2, CLC and SLAC1/SLAH nitrate transporters in hexaploid wheat (Triticum aestivum)

Nitrogen transport, mediated by specialized transmembrane proteins, is one of the most important processes in plants. Plants have two main systems for nitrogen uptake from soil and its transport within the plant: a low-affinity transport system and a high-affinity transport system. Nitrate transporters are of special interest in cereal crops because large amounts of money are spent on N fertilizers every year to enhance crop productivity. To date, four gene families of nitrate transporter proteins have been reported in plants: NPF (nitrate transporter 1/peptide transporter family), NRT2 (nitrate transporter 2 family), CLC (chloride channel family), and SLAC/SLAH (slow anion channel-associated homologues). In our study, in silico mining of nitrate transporter genes was carried out along with detailed structural, phylogenetic and expression analysis. A total of 412 nitrate transporter genes were identified in the hexaploid wheat genome using HMMER-based homology searches in IWGSC RefSeq v2.0. Of these, 20 genes were root specific, 11 were leaf/shoot specific and 17 were grain/spike specific. The identification of nitrate transporter genes in close proximity to 67 previously identified marker-trait associations for nitrogen use efficiency related traits in a nested synthetic hexaploid wheat introgression library indicated the robustness of the reported transporter genes. The detailed crosstalk between the genome and proteome, and the validation of the identified putative candidate genes through expression and gene editing studies, may lay the foundation for improving the nitrogen use efficiency of cereal crops.

transporter to be identified in Arabidopsis 8 . The NRT1 transporter family, which has been renamed the NPF family, is the largest family of nitrate transporters and can be further classified into eight subfamilies 9 . In Arabidopsis, the NPF family has been well characterized and contains 53 members divided into eight subfamilies 9 . In rice (Oryza sativa), the NPF family contains 93 members 10 . The majority of NPF transporters are involved in the low-affinity transport system (LATS), with a few exceptions such as NRT1.1/NPF6.3 in Arabidopsis and MtNRT1.3 in Medicago truncatula, which are involved in both the high-affinity transport system (HATS) and LATS 11,12 . Although the majority of NPFs are involved in nitrate transport, several studies have suggested a role in the transport of other substrates such as nitrite 13 , peptides 14 , amino acids 15 and several plant hormones [16][17][18][19][20] . The second family, known as NRT2, contains high-affinity nitrate transporters. A total of seven NRT2 transporters in Arabidopsis 21 and five NRT2 transporters in rice have been reported 22,23 . Most NRT2 transporters require a partner protein, NAR2 (nitrate assimilation related protein), to function as high-affinity nitrate transporters [22][23][24][25] . The third family of nitrate transporters, the CLC (chloride channel) family, is mainly associated with vacuolar transport of NO3− 26 . In Arabidopsis, six CLC genes have been reported; they are responsible for nitrate and chloride homoeostasis, thereby regulating stomatal movement and salt tolerance [26][27][28] . The fourth family, SLAC1/SLAH (slow type anion channel associated homologs), is an anion channel family. In Arabidopsis this family contains four members (SLAC1, SLAH1, SLAH2 and SLAH3), which are involved in nitrate transport in guard cells and roots and in chloride acquisition 29 . Together these four transporter families are involved in efficient nitrate uptake and utilization in plants.
To the best of our knowledge, the nitrate transporters in hexaploid wheat have not been completely characterized and explored. Some studies have been conducted to assess the effect of different nitrogen conditions on some NPF and NRT2 genes 30 . Most of the studies in wheat have been conducted on members of the TaNRT2 gene family. Overexpression of TaNRT2.5 has been associated with increased grain nitrate uptake and yield 31 . TaNRT2.1 has been associated with post-flowering nitrate uptake in wheat 32 . Expression of TaNRT2.1 can be induced by nitrogen starvation and abscisic acid (ABA) [33][34][35][36][37] . Some phylogenetic and expression-based studies have been conducted on NPF and NRT2 genes recently [34][35][36]38 , but CLC and SLAC1/SLAH genes remain uncharacterized. Protein structure plays a very important role in the functionality of transporter proteins, yet no studies have been conducted on structure prediction for any of the NPF, NRT2, CLC and SLAC1/SLAH proteins in wheat. In our study we have identified and characterized genes belonging to all four families of nitrate transporters. Our analysis includes gene composition, chromosomal location, phylogenetic relationships with members from rice and Arabidopsis, and expression analysis. We adopted a new nomenclature for the identified genes, as the earlier nomenclature systems do not include complete information about subgenome and homoeologs. We classified the genes based on phylogeny and identified homoeologous groups of genes. Expression profiles of all the genes were studied for different developmental stages and different tissues. Further, the structures of all the members of the gene families were investigated.

Methodology
Sequence search and annotation of nitrate transporter genes. Two methods were used for the identification of NRT1/NPF and NRT2 genes in wheat. In the first method, the CDD IDs (conserved domain database IDs) specific to TaNPF, TaCLC, TaSLAC/TaSLAH and TaNRT2 genes (Table 1) were used as identifiers to retrieve genes from the wheat reference genome (IWGSC RefSeq v2.0) in Ensembl Plants (https://plants.ensembl.org/index.html). In the second method, protein sequences were downloaded from the NCBI database using "Nitrate/Nitrogen transporters" and "NRT" as queries. Incomplete, partial, hypothetical, and predicted protein sequences were filtered out. The downloaded sequences were manually curated to remove duplicate and incomplete sequences. The remaining protein sequences (1687 genes) were aligned using Clustal Omega, and the output Stockholm file was used to create the HMMER profile. The HMMER profile was used to search for similar protein sequences in the wheat protein database downloaded from IWGSC. A total of 403 high confidence and 38 low confidence proteins were obtained. Separate searches were performed for TaCLC and TaSLAC1/TaSLAH genes using the same method. A total of 41 TaCLC and 43 TaSLAC1/TaSLAH high confidence genes and 10 TaCLC and 7 TaSLAC1/TaSLAH low confidence genes were obtained. The sequences from both methods were combined, low confidence proteins and duplicate sequences were removed, and after manual curation a final set of 412 genes belonging to all four nitrate transporter families was selected. The same methodology was used to identify sequences for Triticum dicoccoides (AABB), T. turgidum (AABB), T. urartu (AA), and Aegilops tauschii (DD) for comparative analysis.
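The final merging step described above (combining the hits from both search methods, then dropping low confidence and duplicate entries) can be sketched with simple set operations. This is only an illustrative sketch, not the authors' pipeline: the function name merge_candidates and the ID LC_placeholder_gene are hypothetical, and the assignment of the example gene IDs to one method or the other is invented for illustration.

```python
def merge_candidates(cdd_hits, hmmer_hits, low_confidence):
    """Combine gene IDs from both search methods.

    Duplicates disappear through the set union; IDs flagged as
    low confidence are then removed. Returns a sorted list.
    """
    combined = set(cdd_hits) | set(hmmer_hits)
    return sorted(combined - set(low_confidence))


# Hypothetical inputs: which method returned which ID is an assumption.
cdd_hits = ["TraesCS1B02G456300", "TraesCS1D02G433200"]
hmmer_hits = ["TraesCS1D02G433200", "TraesCSU02G001600", "LC_placeholder_gene"]
low_confidence = ["LC_placeholder_gene"]

final_set = merge_candidates(cdd_hits, hmmer_hits, low_confidence)
print(final_set)
```

Manual curation, which the text lists as the last step, is deliberately left out because it cannot be expressed as a fixed rule.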
Maximum likelihood phylogeny of nitrate transporter genes. The alignments of TaNRT1/TaNPF, TaCLC, TaSLAC1/TaSLAH and TaNRT2 sequences were created separately using wheat, rice, and Arabidopsis sequences with MAFFT (E-INS-i algorithm). The evolutionary history was inferred using the Maximum Likelihood method and the JTT matrix-based model. Initial tree(s) for the heuristic search were obtained automatically by applying the Neighbor-Joining and BioNJ algorithms to a matrix of pairwise distances estimated using the JTT model, and then selecting the topology with the superior log-likelihood value. Evolutionary studies were con[...]
Structure prediction of nitrate transporter proteins. Due to the unavailability of crystal structures, homology modelling was carried out to predict the three-dimensional (3D) structures of the proteins. The sequences of TaNRT1, TaCLC, TaSLAC1/TaSLAH and TaNRT2 proteins were submitted to the web-based server Phyre2 41 . Briefly, Phyre2 uses PSI-BLAST to detect sequence homologues, followed by PSIPRED and DISOPRED to predict secondary structure and disorder. Hidden Markov models (HMMs) of the sequences are then generated from the detected homologues. The HMMs of the query proteins are scanned against a library of HMMs of proteins with experimentally solved structures to construct 3D models of the query proteins. Transmembrane helix and topology prediction was carried out with MEMSAT-SVM 41 .
Expression analysis of nitrate transporter genes. The RNA-seq data for TaNPF, TaNRT2, TaCLC and TaSLAC1/TaSLAH genes in various tissues (root, shoot/leaf, spike, grain) at three developmental stages (seedling, vegetative and reproductive) for the cultivars Chinese Spring and Azhurnaya were downloaded from the wheat expression database (www.wheat-expression.com). Expression levels were downloaded as log2(transcripts per million) (log2 tpm) for different tissues at different time points. Several tissue-specific (root, shoot, leaf, grain) genes were identified based on expression patterns.
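The text does not state the exact rule used to call a gene tissue specific. As a minimal sketch, assuming a gene counts as expressed in a tissue when its expression reaches 1 tpm (the cut-off the paper uses elsewhere for low or no expression), a classification could look as follows. The function name classify_expression and all example values are hypothetical.

```python
def classify_expression(tpm_by_tissue, threshold=1.0):
    """Label a gene by the tissues in which it is expressed.

    tpm_by_tissue maps tissue name -> expression in tpm (not log2).
    The 1 tpm threshold is an assumption borrowed from the cut-off
    the paper applies to lowly expressed genes and triads.
    """
    expressed = sorted(t for t, v in tpm_by_tissue.items() if v >= threshold)
    if not expressed:
        return "not expressed"
    if len(expressed) == len(tpm_by_tissue):
        return "ubiquitous"
    return "/".join(expressed) + " specific"


# Hypothetical expression values for three illustrative genes.
print(classify_expression({"root": 12.4, "leaf": 0.2, "spike": 0.0, "grain": 0.5}))
print(classify_expression({"root": 5.0, "leaf": 3.0, "spike": 2.0, "grain": 4.0}))
print(classify_expression({"root": 0.1, "leaf": 3.0, "spike": 2.0, "grain": 0.0}))
```

If the downloaded values are on the log2 scale, they would need to be converted back to tpm before the threshold is applied.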
For triad expression analysis, the method described by Ramírez-González et al. 42 was used. Briefly, the expression data from Chinese Spring (CS) and Azhurnaya were downloaded from the wheat expression database as TPM for root, leaf, shoot, spike and grain. For analysis, the triads with expression below one tpm were excluded. Expression values were normalized, and triads were assigned to balanced, A/B/D suppressed or A/B/D dominant profiles. To elucidate the response of nitrate transporter genes to N starvation and N recovery, the gene expression data set [34][35][36] from the WheatOmics 1.0 database (http://wheatomics.sdau.edu.cn/) was analysed. The dataset contained expression data from roots of 10-day-old wheat plants (Chinese Spring) subjected to N starvation for 5 days and then to N recovery 34-36 .
Table 1 (partial; the column headers, the label of the first row and the counts of the NRT2 row were lost in extraction).
CDD ID    Subfamily   Gene counts
[...]     [...]       22   3   6  15  15   7   7
cd17414   NPF4        33   7  12  23  22   9   9
cd17415   NPF3        12   1   5   8   8   3   3
cd17416   NPF1 & 2    47  17  13  39  32  17  16
cd17417   NPF5        97  16  32  73  79  31  29
cd17418   NPF8        70   5  16  47  47  20  24
cd17419   NPF7        11   3   4  10   8   5   3
cd17341   NRT2        [...]
Development of validation panel to check the efficacy of the identified nitrate transporter genes. The nested synthetic hexaploid wheat (N-SHW) introgression library constituting a set of 352 breed[...] 43 . The N-SHW library, six parents and two synthetic hexaploid wheats were assessed over 2 years (2018 and 2019) at three nitrogen levels [i.e., zero N (0 kg ha−1), half N (60 kg ha−1) and full N (recommended, 120 kg ha−1)]. The detailed phenotyping of the N-SHW introgression library for the nitrogen-use efficiency related traits was carried out across years and treatments 43 . High-density genotyping was performed using the 35 K Axiom® Wheat Breeder's Array (Affymetrix UK Ltd., United Kingdom). The population structure of the 352 N-SHW lines was assessed on the basis of 9,474 SNPs distributed across all 21 wheat chromosomes.
The most appropriate K explaining the population structure was K = 3 at MAF ≥ 5% (Supplementary Fig. 4A); this number of sub-populations was determined from the largest delta K, observed at K = 3 (Supplementary Fig. 4E). The kinship heatmap suggested weak relatedness in the panel (Supplementary Fig. 4B). The first three principal components (PCs) were the most informative, with the contribution of subsequent PCs decreasing gradually until the tenth PC (Supplementary Fig. 4C,D). The kinship and PCs were included in the GWAS analysis to correct for population structure. Significant marker-trait associations were identified using CMLM (compressed mixed linear model)/P3D (population parameters previously defined) in GAPIT (Genome Association and Prediction Integrated Tool) executed in R. Over 322 marker-trait associations for NUE were compared to nitrate transporter genes.
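The minor allele frequency (MAF) threshold of 5% mentioned above can be illustrated with a short sketch. It assumes genotypes coded as alternate-allele counts (0, 1 or 2) with None for missing calls; that coding, the function names and the example SNPs are assumptions for illustration and are not taken from the study.

```python
def minor_allele_frequency(genotypes):
    """MAF for one SNP from alternate-allele counts (0, 1, 2).

    Missing calls (None) are ignored. Returns 0.0 if nothing was called.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def filter_by_maf(snp_genotypes, min_maf=0.05):
    """Keep the names of SNPs whose MAF is at least min_maf."""
    return [name for name, g in snp_genotypes.items()
            if minor_allele_frequency(g) >= min_maf]


# Hypothetical genotype vectors for three SNPs.
snps = {
    "snp_common": [0, 1, 2, 1, None, 0],
    "snp_rare": [0] * 19 + [1],
    "snp_fixed": [2, 2, 2, 2],
}
print(filter_by_maf(snps))
```

In practice this filtering is usually done by the genotyping or GWAS software itself; the sketch only makes the arithmetic behind the threshold explicit.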

Results
The [...] (Table 1). The TaNPF5 subgroup was the largest, consisting of 97 genes, followed by TaNPF8 (70 genes), TaNPF2 (41 genes), TaNPF4 (33 genes), TaNPF6 (22 genes), TaNPF3 (12 genes) and TaNPF7 (11 genes). The NPF1 subgroup was the smallest, consisting of 6 genes present on homoeologous group chromosomes 3A, 3B and 3D. TaNRT1/TaNPF genes were present throughout the genome (Fig. 1). The location of genes across chromosomes varied according to the size of the subfamily. The genes belonging to larger subfamilies (e.g., TaNPF5, TaNPF8, TaNPF2) were predominantly located in tandem positions in the distal regions of chromosomes. The genes belonging to smaller subfamilies (TaNPF1, TaNPF7, TaNPF3) were located in the proximal regions of chromosomes. The genes near the distal ends of chromosomes were found in clusters in close vicinity to each other. The majority of TaNRT2 genes were present in clusters on the distal ends of homoeologous chromosomes 6A, 6B and 6D. TaCLC genes were distributed across the wheat genome. TaSLAC1/TaSLAH genes were distributed only on homoeologous chromosomes 1A, 1B, 1D, 2A, 2B, 2D, 3A, 3B and 3D. The predicted gene structures contained several introns (Supplementary Fig. 1a-c) for many genes in the TaNPF, TaCLC and TaSLAC1/TaSLAH families. All the TaNRT2 genes were intronless. The size of the predicted genes ranged between 1 and 25 kb. Several truncated and duplicated genes were also predicted.
Phylogenetic relationships among nitrate transporter genes. The maximum likelihood phylogenetic tree of all the nitrate transporter genes predicted that wheat contains all the major subfamilies present in Arabidopsis and rice (Oryza sativa) (Fig. 2a). The TaNRT1/TaNPF and TaNRT2 genes could be classified into five subclades. The subclades in the phylogenetic tree followed species phylogeny with Arabidopsis genes displaying sister group relationship with wheat genes. Based on the phylogenetic relationship, TaNRT1/TaNPF genes fitted well into eight subfamilies (TaNPF1 to TaNPF8) following the Arabidopsis model. The topology of larger subclades (TaNPF5, TaNPF8, TaNPF2) was more complex than smaller subclades as they were more expanded in wheat than Arabidopsis and rice (Fig. 2a, Supplementary Fig. 2). TaNRT2 genes were present as a separate subclade and were closely related to the TaNPF2 subfamily. The phylogenetic analysis of TaCLC and TaSLAC1/TaSLAH genes was carried out separately. The results showed TaCLC genes could be classified into 6 groups according to phylogenetic relation with Arabidopsis and rice genes (Fig. 2b). TaSLAC1/TaSLAH genes were divided into 4 subclades. The largest subclade in TaSLAC1/TaSLAH genes showed close relationship with rice SLAC1/SLAH genes but not with Arabidopsis genes (Fig. 2c).

[Table fragment: TaSLAC1/TaSLAH gene IDs and assigned gene names; column headers and the remaining rows were lost in extraction]
[...]  TraesCSU02G001400  TaSLAC-1B9  TaSLAC-1D9  TaSLAC-Un3
TaSLAC-T11  TraesCS1B02G456300  TraesCS1D02G433200  TraesCSU02G001600  TaSLAC-1B10  TaSLAC-1D10  TaSLAC-Un4
[...] Table 3, Supplementary Fig. 3). The remaining genes showed very low or no expression (tpm < 1). Overall, we identified 20 triads in which 48 genes showed tissue-specific expression; of these, 8 triads were root specific, 5 triads were leaf/shoot specific and 7 triads showed grain/spike-specific expression (Supplementary Table 4). Tissue- and developmental stage-specific expression was observed in TaNPF1 genes, which were expressed only in spike and grain at the reproductive stage (Fig. 5A). Similarly, TaNRT2 genes were predominantly expressed in roots at both the vegetative and reproductive stages (Fig. 5A). TaSLAC1/TaSLAH genes were predominantly expressed in roots and leaves, with some genes also showing expression in spikes (Fig. 5B). TaCLC genes showed mostly ubiquitous expression (Fig. 5B). For the rest of the subfamilies, the genes within one subfamily differed considerably in their expression patterns. In TaNPF2 genes, spike/grain-specific (3 genes), leaf, spike and grain-specific (5 genes) and ubiquitous expression (6 genes) were observed (Fig. 5A). TaNPF3 genes showed spike/grain- and leaf-specific expression, and TaNPF4 genes showed leaf/root-specific (4 genes) and ubiquitous expression (10 genes) (Fig. 5A). TaNPF5 and TaNPF8 genes mostly showed ubiquitous expression, though root-specific expression was observed in a few genes (Fig. 5A). TaNPF6 showed ubiquitous (6 genes), leaf and root-specific (6 genes), spike-specific (3 genes) and root-specific expression (Fig. 5A). TaNPF7 showed ubiquitous expression in three genes, grain-specific expression in two genes and root-specific expression in one gene (Fig. 5A). To determine the extent to which homoeologs differ in their expression patterns, triad expression analysis was performed.
Most of the triads showed balanced expression, ranging from 55.6 to 65.2% across the tissues (Fig. 6A). In roots, 54 of the 83 triads were expressed. Of these, 55.6% showed balanced expression, 18.5% showed A suppressed, 11.1% showed D suppressed and 9.3% showed B suppressed expression. Three triads showed A, B and D dominant expression (one each) (Fig. 6B). In leaf/shoot, of 51 triads, 64.7% showed balanced expression, 9.8% each showed A suppressed and B suppressed expression, and 3.9% showed D suppressed expression; 5.8% of triads each showed A dominant and D dominant expression, while no B dominant expression was observed (Fig. 6B). In spikes, 61.9% of the 42 triads showed balanced expression. Only D dominant expression was observed among the dominant categories, in 9.5% of triads, while A suppressed, B suppressed and D suppressed expression was observed in about 16.7%, 7.1% and 4.7% of triads, respectively (Fig. 6B). Only 23 triads were expressed in grains at the reproductive stage, of which 65.2% showed balanced expression, 8.7% each showed A, B and D suppressed expression, and 4.3% each showed B and D dominant expression (Fig. 6B).
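The assignment of triads to balanced, suppressed or dominant categories can be sketched as follows. The sketch follows the general idea of the Ramírez-González et al. approach as we understand it: the expression of the A, B and D homoeologs is normalized to relative contributions that sum to one, and the triad takes the category whose ideal profile is nearest in Euclidean distance. The ideal profiles, the function name classify_triad and the choice to apply the 1 tpm cut-off to the summed expression are assumptions made for illustration.

```python
import math

# Ideal relative contributions (A, B, D) for the seven categories (assumed).
IDEAL_PROFILES = {
    "balanced": (1 / 3, 1 / 3, 1 / 3),
    "A dominant": (1.0, 0.0, 0.0),
    "B dominant": (0.0, 1.0, 0.0),
    "D dominant": (0.0, 0.0, 1.0),
    "A suppressed": (0.0, 0.5, 0.5),
    "B suppressed": (0.5, 0.0, 0.5),
    "D suppressed": (0.5, 0.5, 0.0),
}


def classify_triad(a_tpm, b_tpm, d_tpm, min_total=1.0):
    """Return the expression category of a triad, or None if excluded.

    Triads whose summed expression is below min_total (tpm) are
    excluded; applying the cut-off to the sum is an assumption.
    """
    total = a_tpm + b_tpm + d_tpm
    if total < min_total:
        return None
    relative = (a_tpm / total, b_tpm / total, d_tpm / total)
    return min(IDEAL_PROFILES,
               key=lambda name: math.dist(relative, IDEAL_PROFILES[name]))


# Hypothetical tpm values for the A, B and D homoeologs.
print(classify_triad(10.0, 10.0, 10.0))
print(classify_triad(30.0, 1.0, 1.0))
print(classify_triad(0.5, 10.0, 10.0))
```

Counting the returned labels per tissue and dividing by the number of expressed triads would give percentages of the kind reported above.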
Nitrate transporter genes are located in close proximity to the NUE associated SNPs. In a parallel study in our laboratory, the nested synthetic hexaploid wheat (N-SHW) introgression libraries capturing novel genetic variation from wild wheat for nitrogen use efficiency related traits were developed and genotyped using a high-density SNP array 43 . These libraries were phenotypically assessed for root traits and agronomic performance under three nitrogen input conditions (N: 0 kg ha−1; N: 60 kg ha−1 and N: 120 kg ha−1) in the field over two years, 2018 and 2019. Genome-wide association mapping was used to identify marker-trait associations for the root and agronomic traits that improve nitrogen use efficiency in wheat (Supplementary Table 5). We compared the 322 marker-trait associations for NUE identified in that study 43 to the nitrate transporter genes identified during the genome-wide analysis. We identified 67 SNPs that were in close proximity to nitrate transporter genes in the wheat genome. A total of 93 nitrate transporter genes could be located near NUE-linked SNPs, of which 63 genes belonged to the TaNPF family, 15 to the TaNRT2 family, 11 to the TaCLC family and 4 to the TaSLAC1/TaSLAH family (Table 4).
Response of nitrate transporter genes during N-starvation and N-recovery. The response of all N transporter genes to N starvation and N recovery was analysed using the WheatOmics database [34][35][36]47,48 . The results suggested that the expression of N transporter genes in response to N starvation and N recovery was variable. We specifically identified the genes whose expression patterns changed significantly in response to N starvation or N recovery. The expression values of TaNPF1 and TaNPF3 genes did not change significantly (Fig. 7A,C).
Three genes in TaNPF2 showed increased expression under N starvation, and their expression values returned to normal during N recovery (Fig. 7B). The expression values of most TaNPF5 genes were slightly reduced during N starvation and increased significantly during N recovery (Fig. 7E,F). Expression of TaNPF6 genes was reduced during both N starvation and N recovery (1 h) but returned to normal 24 h after recovery (Fig. 7G). The expression of most TaNPF7 genes was upregulated during N starvation and N recovery (1 h) and downregulated after 24 h of N recovery (Fig. 7H). The expression of TaNPF4 and TaNPF8 genes was variable (Fig. 7D,I,J). The expression of most TaNRT2 and TaCLC genes was upregulated during the N recovery (1 h) phase (Fig. 7K,L,M,N). The expression values of some TaSLAC1/TaSLAH genes were reduced in response to N starvation and increased during N recovery (24 h) (Fig. 7O,P). Looking specifically at the expression patterns of the 93 genes in close proximity to NUE-associated SNPs, we could identify 32 genes whose expression pattern changed in response to N starvation and N recovery (Supplementary Fig. 6, Supplementary Table 6). These genes can serve as candidate genes and may be further utilized in genomics-assisted breeding programs targeting improved nitrogen-use efficiency in wheat.
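The comparison between NUE-associated SNPs and transporter gene positions described earlier in the Results can be sketched as a window search on physical coordinates. The text does not state the distance used to define close proximity, so window_bp is a free parameter here; the record types Gene and Snp, the function name genes_near_snps and every coordinate in the example are hypothetical.

```python
from collections import namedtuple

Gene = namedtuple("Gene", "name chrom start end")
Snp = namedtuple("Snp", "name chrom pos")


def genes_near_snps(genes, snps, window_bp):
    """Map each SNP to the genes lying within window_bp of it.

    A gene counts as near a SNP when both are on the same chromosome
    and the SNP falls inside the gene interval extended by window_bp
    on each side. SNPs without any nearby gene are omitted.
    """
    hits = {}
    for snp in snps:
        near = [g.name for g in genes
                if g.chrom == snp.chrom
                and g.start - window_bp <= snp.pos <= g.end + window_bp]
        if near:
            hits[snp.name] = near
    return hits


# Hypothetical genes and SNPs; positions are invented for illustration.
genes = [
    Gene("geneA", "6A", 1000, 2000),
    Gene("geneB", "6A", 500000, 501000),
    Gene("geneC", "6B", 1500, 2500),
]
snps = [Snp("snp1", "6A", 2500), Snp("snp2", "6B", 900000)]

print(genes_near_snps(genes, snps, window_bp=1000))
```

A realistic window would more likely be chosen from the linkage disequilibrium decay of the panel than fixed arbitrarily, which is why it is left as an argument.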

Discussion
The main aim of this study was to identify and analyse nitrate transporters belonging to all four families and to study their dynamics in wheat. The number of nitrate transporter genes detected in wheat was higher than in other plant species. This could be explained by the large genome (~18 Gb) and hexaploid nature of wheat. The presence of three homoeologous sub-genomes in wheat could allow multiple copies of nitrate transporters, resulting in a higher number of transporter genes. In comparison with the diploid progenitors (Ae. tauschii and T. urartu) and tetraploid wheats (T. dicoccoides and T. turgidum), the number of genes in each subfamily was approximately proportional (Table 1). The genes were distributed randomly in the genome except for TaNRT2 genes, which were predominantly present on the group 6 homoeologous chromosomes. Many genes were present in clusters and showed a high percentage of similarity, indicating gene duplication events. Genes with deleted segments were also present in the genome. The phylogenetic relationships with orthologues in other plants could be used to classify the genes into subfamilies. All the major subclades were conserved in wheat in comparison with other plant species, indicating the biological importance of the subfamilies. Based on phylogeny, the genes could be grouped into homoeologous triads. Almost 73% of the genes could be assigned to 1:1:1 homoeologous groups, which is well above the average homoeologous retention rate (35.8%) in wheat (IWGSC 2018). Many genes were also grouped into tetrads and diads based on homology, indicating gene duplication and deletion events in the genome. The overall results revealed that the wheat nitrogen transporter families are much more complex than those of other plant species. This complexity arises mostly from the presence of three sub-genomes (A, B and D) and from gene duplication and deletion events. The complexity of the wheat genome also affects the expression patterns of genes.
Due to the presence of multiple sets of homoeologs on the A, B and D genomes, buffering effects are observed in the expression of genes. To study the extent to which these interactions affect the expression of nitrate transporters, triad expression analysis was performed. More than 55% of triads showed balanced expression in all the tissues, which is comparable to the genome-wide assessment of all transcripts in wheat 42 . The expression profiles of the genes identified in this study were in accordance with previous studies in other plants. The expression patterns of nitrate transporter genes were similar to the expression patterns of close orthologs in rice and Arabidopsis, indicating the conservation of gene functions. CLC genes in previous studies in Arabidopsis showed ubiquitous expression, which was also observed in this study for wheat 27,28 . Several tissue-specific nitrate transporter genes were identified which can be targeted for gene manipulation for wheat improvement. Several TaNRT2 and TaSLAC1/TaSLAH genes showed root-specific expression, suggesting a role in root nitrate uptake. Root-specific expression of NRT2 and SLAC1/SLAH genes has already been reported in rice and Arabidopsis 29,49 . TaNPF1 genes and some TaSLAC1/TaSLAH genes showed grain- and spike-specific expression, suggesting a role in nitrate transfer to developing seeds.
Structure plays a very important role in the function of transporter proteins. X-ray crystallographic structures of eukaryotic nitrate transporters have been elucidated 50 . All the nitrate transporter families belong to a much larger major facilitator superfamily (MFS) according to the transporter classification database 51 . All the nitrate transporter proteins were predicted to have a typical MFS protein structure with multiple transmembrane segments (TMs). To the best of our knowledge, our study is the first to report homology-based models of nitrate transporter proteins belonging to all four families in wheat. The number of transmembrane segments plays a very important role in the optimal functioning of MFS transporter proteins 52 . For an MFS transporter protein to have optimal transport properties, pseudosymmetry is important, and this is provided by an even number of TMs 50 . According to previous studies, most MFS proteins require 12 TMs for optimal function 53 . In our study, the predicted nitrate transporter families varied in the number of TMs. The TaNPF family, being the largest of all, showed the most variation, with the number of TMs ranging from 12 to 14. Several proteins with an odd number of TMs were also observed. For example, all the members of the TaNPF1 subfamily contain 13 TMs. All TaNRT2 proteins were highly conserved and contained 12 TMs. Most of the TaCLC and TaSLAC1/TaSLAH proteins contained only 10 TMs. The variation in the number of TMs between and within subfamilies, and the presence of odd numbers of TMs, could not be correlated with expression data, suggesting that much more flexible criteria exist for the function of nitrate transporter proteins. The structural information presented in this study offers a foundation for future work to identify the molecular mechanisms responsible for the functioning of nitrate transporters in wheat.
In many previous studies, overexpression of nitrate transporter genes has been linked to improved nitrogen use efficiency and yield in many plants 54-58 . Overexpression of OsNRT2.1, OsNRT2.3b and OsNPF6.3 in rice and ZmNRT1.1A in maize has resulted in increased grain yield 25,[34][35][36],57,59 . In wheat, TaNRT2.1 is reported to be involved in post-flowering N uptake 32 and is an important gene for the improvement of nitrogen use efficiency. The CLC genes have been reported to be involved in nitrate accumulation in plants 26 , and many CLC genes have been reported to have a role in stress responses. SLAC1 is a key player in the regulation of stomatal closure. SLAH genes are involved in root nitrate and chloride acquisition and translocation to the shoot. SLAC1/SLAH genes have also been reported to have an important role in drought responses 49 . To the best of our knowledge, the genome-wide analysis of TaCLC and TaSLAC1/TaSLAH genes in this study is the first reported for these genes in wheat. The nitrate transporters identified in this study can be promising candidates for gene manipulation to enhance productivity and nitrogen use efficiency in wheat. The identification of nitrate transporter genes in close proximity to the marker-trait associations indicated the robustness of the genome-wide association mapping studies and the reliability of the reported transporter genes. The identified nitrate transporters could deepen the [...]
Table 3. Number of triads, tetrads, diads and singletons detected in nitrate transporter families in the hexaploid wheat genome.
[...] 17 grain/spike specific putative candidate genes. The identification of nitrate transporter genes in close proximity to 67 previously identified marker-trait associations for nitrogen use efficiency related traits in the nested synthetic hexaploid wheat introgression library 43 indicated the robustness of the reported transporter genes.
The detailed crosstalk between the genome and proteome, and the validation of the identified putative candidate genes through expression and gene editing studies, may lay the foundation for improving the nitrogen use efficiency of cereal crops. The existing genetic variability for the 48 tissue-specific genes and the 93 genes in close proximity to NUE-associated SNPs identified in the present study, across different wild and cultivated wheat accessions/varieties, may be further utilized in genomics-assisted breeding programs targeting improved nitrogen-use efficiency in wheat. A total of 32 of these 93 genes show significant changes in expression pattern in response to N starvation and/or N recovery, suggesting their involvement in N uptake and assimilation. These genes can serve as initial candidates for targeting N use efficiency in wheat. Improved breeding lines or wild accessions possessing the potential nitrate transporters may serve as novel donors in genomics-assisted introgression programs for developing nitrogen-efficient wheat varieties. The identified nitrate transporters may have potential for efficient nitrogen uptake and its transport from source to sink. Once validated, the candidate genes may be deployed in genomics-assisted breeding programs to develop nutrient-efficient wheat varieties. The present study provides important information on potential nitrate transporters that may lay the foundation for a new breeding strategy for the sustainable agricultural development of cereal crops, with less input, more output and environmental protection. The identified nitrate transporters may be of great significance both in theory and in genomics-assisted breeding applications.

Data availability
All data used in this research are included in this published article and its supplementary information files.